-
Notifications
You must be signed in to change notification settings - Fork 2.5k
Prevent adding duplicate files to the system prompt #7657
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent adding duplicate files to the system prompt #7657
Conversation
…resolved paths (symlinks) and md5 sum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your contribution! This PR successfully addresses the duplicate files issue in the system prompt. The implementation using a FileTracker with both MD5 content hashing and resolved path tracking is solid and well-tested. I've left some suggestions inline that could improve performance and maintainability.
|
Looks like the build fails because Node's "crypto" is not in the browser context. I'm not sure how to handle that without adding a dependency like https://www.npmjs.com/package/crypto-browserify - what's the preference? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, regarding the crypto module issue you mentioned - the problem is that src/shared/modes.ts imports addCustomInstructions from custom-instructions.ts at the top level (line 14).
During the webview build, Vite analyzes all imports in shared modules and tries to bundle them. Since WebviewMessage.ts imports the Mode type from ./modes, Vite processes the entire shared/modes.ts file including all its imports. This pulls in custom-instructions.ts which uses Node.js-only modules (fs/promises and now crypto) that don't exist in browsers.
The issue existed before with fs/promises (which Vite was externalizing with warnings), but adding crypto makes it worse and causes actual build failures.
So should crypto be replaced with a pure JS implementation or do you have some other preference? |
|
I think the fix has two parts. For this PR, let's focus on the
This approach means we won't need a polyfill like Also, probably a good idea to update the test names from "MD5" to something more generic like "content hash" so we're not tied to a specific algorithm. |
|
Thanks for the suggestions @daniel-lxs , I removed crypto and added native FNV-1a as you proposed so there are no new dependencies. |
|
I don't think this PR will solve issue #6604 since the injection of the file comes from the CLI itself, outside our logic to load files. I'll close this PR for now but let me know if something like this makes sense. |
Related GitHub Issue
Closes: #6604
Roo Code Task Context (Optional)
https://app.roocode.com/share/b11fe4ae-11cc-4e1d-b944-c47551eebe35
Description
Roo Code has many possible files it can use to generate the system prompt. If any of these files are symlinks to the same file or just a complete copy, Roo will still add them and so you end up with a bloated system prompt from duplicates.
Example:
Test Procedure
Pre-Submission Checklist
Screenshots / Videos
Documentation Updates
Does this PR necessitate updates to user-facing documentation?
Subjectively, this wasn't documented before and there is no downside to de-duplicating the prompts so not sure it needs to be documented..
Additional Notes
A more advanced method of detecting large nearly duplicate files and warning the user might be useful for some. For me, just not duplicating my rather large AGENTS.md file is a huge win.
Get in Touch
colin3475
Important
This PR prevents duplicate files in the system prompt by using a
FileTrackerto check for duplicate content and resolved paths, with tests added for validation.custom-instructions.ts.FileTrackerinterface andshouldIncludeFile()function to track and check duplicates.loadRuleFiles()andreadTextFilesFromDirectory()to useFileTracker.custom-instructions.spec.tsfor deduplication of files with identical content and symlinks.This description was created by
for a661d39. You can customize this summary. It will automatically update as commits are pushed.